Rational Learning of Mixed Equilibria
Abstract
This paper investigates the problem of policy learning in multi-agent environments using the stochastic game framework, which we briefly overview. We introduce two properties as desirable for a learning agent in the presence of other learning agents: rationality and convergence. We examine existing reinforcement learning algorithms against these two properties and find that none satisfies both. We then contribute a new learning algorithm, adjusted policy hill-climbing, which achieves both properties and is based on a simple principle: "learn faster when losing, slower when winning." The algorithm is rational and we demonstrate it on several stochastic games, showing it to be capable of converging to mixed equilibria.
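The principle quoted in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not define "winning", so the sketch assumes one plausible criterion (the current policy scores better against the current value estimates than the agent's own long-run average policy). The names `choose_step`, `hill_climb`, `delta_win`, `delta_lose`, and `avg_policy` are all hypothetical.

```python
from typing import List


def expected_value(policy: List[float], q: List[float]) -> float:
    """Expected payoff of a mixed policy under action-value estimates q."""
    return sum(p * v for p, v in zip(policy, q))


def choose_step(policy: List[float], avg_policy: List[float], q: List[float],
                delta_win: float = 0.01, delta_lose: float = 0.04) -> float:
    """Pick a small step when 'winning' and a larger one when 'losing'.

    ASSUMPTION: 'winning' means the current policy does better than the
    average policy under q. The abstract does not state this criterion.
    """
    winning = expected_value(policy, q) > expected_value(avg_policy, q)
    return delta_win if winning else delta_lose


def hill_climb(policy: List[float], q: List[float], delta: float) -> List[float]:
    """Shift up to `delta` probability mass toward the greedy action.

    Mass is taken evenly from the non-greedy actions and never drives a
    probability below zero, so the result stays a valid distribution.
    """
    best = max(range(len(q)), key=lambda a: q[a])
    others = len(policy) - 1
    new_policy = list(policy)
    moved = 0.0
    for a in range(len(policy)):
        if a != best:
            take = min(new_policy[a], delta / others)
            new_policy[a] -= take
            moved += take
    new_policy[best] += moved
    return new_policy


# Two-action example: action 0 currently looks better than action 1.
q = [1.0, -1.0]
policy = [0.5, 0.5]
losing_step = choose_step(policy, [0.8, 0.2], q)   # average policy scores higher
winning_step = choose_step(policy, [0.2, 0.8], q)  # average policy scores lower
after_losing = hill_climb(policy, q, losing_step)
after_winning = hill_climb(policy, q, winning_step)
```

In this sketch the policy moves toward the greedy action in both cases, but the move is four times larger when the agent is judged to be losing. The specific step sizes are illustrative only.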
Similar resources
Stable near-rational sunspot equilibria
A new class of near-rational sunspot equilibria is identified in economies expressed as non-linear forward-looking models. The new equilibria are natural extensions of the usual sunspot equilibria associated to the linearized version of the economy, and are near-rational in that agents use the optimal linear forecasting model when forming expectations. A generic existence result is established....
On the Convergence of Genetic Learning in a Double Auction Market
We study the learning behavior of a population of buyers and a population of sellers whose members are repeatedly randomly matched to engage in a sealed bid double auction. The agents are assumed to be boundedly rational and they update their strategies by imitation of successful agents and innovation triggered by random errors or communication with other agents. This process is modelled by a t...
Lying for Strategic Advantage: Rational and Boundedly Rational Misrepresentation of Intentions
Starting from an example of the Allies’ decision to feint at Calais and attack Normandy on D-Day, this paper models misrepresentation of intentions to competitors or enemies. Allowing for the possibility of bounded strategic rationality and rational players’ responses to it yields a sensible account of lying via costless, noiseless messages. In some leading cases, the model has generically uniq...
Lying for Strategic Advantage: Rational and Boundedly Rational Misrepresentation of Intentions
Starting from Hendricks and McAfee's (2000) example of the Allies' decision to feint at Calais and attack at Normandy on D-Day, this paper models misrepresentation of intentions to competitors or enemies. Allowing for the possibility of bounded strategic rationality and rational players' responses to it yields a sensible account of lying via costless, noiseless messages. In many cases the model...
Fair and Efficient Solutions to the Santa Fe Bar Problem
This paper asks the question: can adaptive, but not necessarily rational, agents learn Nash equilibrium behavior in the Santa Fe Bar Problem? To answer this question, three learning algorithms are simulated: fictitious play, no-regret learning, and Q-learning. Conditions under which these algorithms can converge to equilibrium behavior are isolated. But it is noted that the pure strategy Nash e...
Publication date: 2000